Fast Learn++.NSE Algorithm Based on Sliding Window
SHEN Yan1,2, ZHU Yuquan2, SONG Xinping3
1. Department of Information Management and Information System, Jiangsu University, Zhenjiang 212013
2. School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013
3. Department of Electronic Commerce, Jiangsu University, Zhenjiang 212013
|
|
Abstract: In Learn++.NSE, the vote weight of each base classifier depends on its error rates in all of the environments experienced so far, so the cost of classification learning grows with the number of time steps and the efficiency of the algorithm needs to be improved. Therefore, a fast Learn++.NSE algorithm based on a sliding window (SW-Learn++.NSE) is presented in this paper. The sliding window is utilized to optimize the weight calculation: only the recent classification error rates of each base classifier inside the window are used to compute its vote weight, which greatly improves the efficiency of ensemble classification learning. The experiments show that SW-Learn++.NSE achieves higher execution efficiency than Learn++.NSE with equivalent classification accuracy.
|
Received: 22 May 2017
|
|
Fund: Supported by National Natural Science Foundation of China (No.61702229, 71573107), Natural Science Foundation of Jiangsu Province (No.BK20150531), Jiangsu Planned Projects for Postdoctoral Research Funds (No.1401056C), National Statistical Science Research Project of China (No.2016LY17), Advanced Talent Foundation of Jiangsu University (No.13JDG127)
About authors: SHEN Yan (corresponding author), born in 1982, Ph.D., associate professor. His research interests include data mining, business intelligence and intelligent information systems. ZHU Yuquan, born in 1965, Ph.D., professor. His research interests include knowledge discovery, big data mining, complex information system integration and intrusion detection. SONG Xinping, born in 1971, Ph.D., professor. Her research interests include big data mining and business intelligence.
|
|
|
|
|
|